This repository has been archived by the owner on Jul 23, 2024. It is now read-only.

[Task Submission] BLM_tasks (blm_tasks) #14

Open
wants to merge 4 commits into
base: main
from

Conversation

CLCL-Geneva

blm_tasks

BLM_tasks (Blackbird Language Matrices) aim to measure rule-like generalization in neural networks using synthetic datasets generated from combinations of rules relevant to specific targeted grammatical phenomena.

Authors

  • Paola Merlo Paola.Merlo@unige.ch
  • Chunyang Jiang Chunyang.Jiang@unige.ch
  • Giuseppe Samo Giuseppe.Samo@unige.ch
  • Vivi Nastase vivi.a.nastase@gmail.com

Implementation

task.py implements:

  • evaluate_predictions: adjusts the computation of the F1 score for the correct answer in our multiple-choice setting
  • format_example: maps the input to the requested format
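
As a rough illustration of what an F1 score for the correct answer can mean in a multiple-choice setting, here is a minimal sketch. It is not the code in task.py: the function name `f1_for_correct_answer`, the index-based inputs, and the use of `None` for an item where no option could be extracted are all assumptions made for this example.

```python
from typing import Optional, Sequence


def f1_for_correct_answer(
    predicted: Sequence[Optional[int]],
    gold: Sequence[int],
) -> float:
    """F1 of the 'correct answer' class over multiple-choice items.

    predicted[i] is the index of the option chosen for item i, or None if
    no option could be extracted (an assumption of this sketch).
    gold[i] is the index of the correct option for item i.
    """
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")

    # A chosen option that matches the gold option is a true positive.
    true_pos = sum(1 for p, g in zip(predicted, gold) if p is not None and p == g)
    # A chosen option that is not the gold option is a false positive.
    false_pos = sum(1 for p, g in zip(predicted, gold) if p is not None and p != g)
    # Every item whose gold option was not chosen is a false negative.
    false_neg = len(gold) - true_pos

    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Four items: two answered correctly, one incorrectly, one unanswered.
    print(f1_for_correct_answer([0, 2, None, 1], [0, 1, 3, 1]))
```

Under these assumptions, when every item receives exactly one prediction, precision and recall both equal accuracy, so F1 equals accuracy as well. The score differs from accuracy only when some items have no extracted answer.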

Usage

The adjusted evaluate_predictions method handles the particularities of our multiple-choice set-up.

Checklist:

  • [x] I and my co-authors agree that, if this PR is merged, the code will be available under the same license as the genbench_cbt repository.
  • [x] Prior to submitting, I have run the GenBench CBT test suite using the genbench-cli test-task tool.
  • [x] I have read the description of what should be in the doc.md of my task, and have added the required arguments.
  • [x] I have submitted or will submit an accompanying paper to the GenBench workshop.

@vernadankers
Contributor

vernadankers commented Sep 1, 2023

Hello!

We are getting quite close to the deadline: Please don't forget to make any final changes to your PR if required, and submit your accompanying paper to Openreview via https://openreview.net/group?id=GenBench.org/2023/Workshop by September 1.

Good luck finalising your PR and paper, feel free to tag us if you have questions.
Cheers, Verna
On behalf of the GenBench team


4 participants